Overview

This parameterized markdown file is a standardized exploratory analysis report. It examines the data types, structure, missing values, unique values, and summary statistics for the continuous and categorical variables contained within a data.frame object.

The report does not yet support date-time variables such as POSIXct or Date, but will eventually move to support all data types typically found in a data.frame in the R environment.

Data Structure

The input file mtcars is a data.frame containing 32 rows and 11 columns. The tables below summarize the dimensions of the object and the class and missing-value count of each variable. The column names are displayed in the code chunk below.

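Dataset Dimensions

| Name | Number of Columns | Number of Rows | Number of Elements | Memory Allocation |
| --- | --- | --- | --- | --- |
| mtcars | 11 | 32 | 352 | 7 Kb |

Dataset Variable Class and Missing Values

| Variable | Class | Missing |
| --- | --- | --- |
| mpg | numeric | 0 |
| cyl | numeric | 0 |
| disp | numeric | 0 |
| hp | numeric | 0 |
| drat | numeric | 0 |
| wt | numeric | 0 |
| qsec | numeric | 0 |
| vs | numeric | 0 |
| am | numeric | 0 |
| gear | numeric | 0 |
| carb | numeric | 0 |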
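
The two structure tables can be approximated with base R alone. The following is a minimal sketch, not the report's actual code: the object names `dimensions` and `variables` are hypothetical, and the memory figure is assumed to come from `object.size()`.

```r
# Sketch: rebuild the dimension and variable-class tables for a data.frame.
df <- mtcars

# One-row table of overall dimensions.
dimensions <- data.frame(
  Name     = "mtcars",
  Columns  = ncol(df),
  Rows     = nrow(df),
  Elements = nrow(df) * ncol(df),
  Memory   = format(object.size(df), units = "Kb"),  # assumed source of the memory figure
  stringsAsFactors = FALSE
)

# One row per variable: first class and count of NA values.
variables <- data.frame(
  Variable = names(df),
  Class    = vapply(df, function(x) class(x)[1], character(1), USE.NAMES = FALSE),
  Missing  = vapply(df, function(x) sum(is.na(x)), integer(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

print(dimensions)
print(variables)
```

Taking only the first element of `class(x)` keeps the table to one row per variable even when a column carries several classes.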

Missing Values, Data Type, Unique Values

Table

The table below displays each variable's data type, the count and proportion of missing values (NA), and whether the data is labelled.
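
As a rough illustration of how such a table could be computed, the sketch below profiles a small, hypothetical data.frame (`toy`) rather than mtcars, since mtcars has no missing values. The column names `n_missing`, `pct_missing`, and `n_unique` are assumptions, and the labelled-data check is omitted.

```r
# Hypothetical toy data used only to show non-zero missing counts.
toy <- data.frame(
  id  = c(1, 2, 3, 4),
  grp = c("a", NA, "b", "a"),
  stringsAsFactors = FALSE
)

# Per-variable type and missing-value count.
profile <- data.frame(
  variable  = names(toy),
  type      = vapply(toy, function(x) class(x)[1], character(1), USE.NAMES = FALSE),
  n_missing = vapply(toy, function(x) sum(is.na(x)), integer(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

# Proportion of missing values, as a percentage of all rows.
profile$pct_missing <- round(100 * profile$n_missing / nrow(toy), 2)

# Number of distinct non-missing values per variable.
profile$n_unique <- vapply(
  toy,
  function(x) length(unique(x[!is.na(x)])),
  integer(1),
  USE.NAMES = FALSE
)

print(profile)
```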

Plot

Unique Values

Variable Content

This section analyzes the content of each variable by classifying its values as numeric, character, punctuation or symbols, or blank space. This is especially useful when data types are not identified or are misclassified.

Only Numbers

only_numbs reports the number of rows that contain only digits "^[0-9]{1,}$" and numpercent is the respective percentage. digits_min and digits_max display the minimum and maximum length of the digit strings found in the variable, and digits_eq is a logical statement of whether the min and max are equal.
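
A minimal sketch of these digit checks, assuming the values are compared as character strings and that the percentage is taken over all rows (including NA). The vector `x` is hypothetical.

```r
# Hypothetical values; NA and "" are included to show they are not counted.
x <- c("123", "45", "12a", "", NA, "0071")

# Rows made up of digits only (grepl returns FALSE for NA).
is_num <- grepl("^[0-9]{1,}$", x)

only_numbs <- sum(is_num)
numpercent <- round(100 * only_numbs / length(x), 2)  # assumed denominator: all rows

# Lengths of the digit-only strings.
digit_len  <- nchar(x[is_num])
digits_min <- min(digit_len)
digits_max <- max(digit_len)
digits_eq  <- digits_min == digits_max

print(c(only_numbs = only_numbs, numpercent = numpercent,
        digits_min = digits_min, digits_max = digits_max))
print(digits_eq)
```

A `digits_eq` of TRUE would suggest a fixed-width code such as an identifier, while FALSE, as here, points to free-form numbers.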

Only Character

only_char reports the number of rows that contain only characters "^[A-z]{1,}" and charpercent is the respective percentage.
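
Two properties of the quoted pattern are worth keeping in mind when reading only_char. It has no trailing `$`, so it matches any value that merely starts with a letter, and in ASCII order the range `A-z` also spans the characters `[ \ ] ^ _` and the backtick. The sketch below uses hypothetical values and a hypothetical stricter pattern for comparison; it is not the report's code.

```r
# Hypothetical values.
x <- c("abc", "Hello", "abc123", "a_b", "42")

# Pattern as quoted in the report: anchored at the start only.
only_char_reported <- sum(grepl("^[A-z]{1,}", x))

# Assumed stricter alternative: letters only, anchored at both ends.
only_char_strict <- sum(grepl("^[A-Za-z]+$", x))

# With code-point ranges (perl = TRUE), A-z also covers the underscore.
underscore_in_range <- grepl("^[A-z]+$", "_", perl = TRUE)

print(c(reported = only_char_reported, strict = only_char_strict))
print(underscore_in_range)
```

With the quoted pattern, "abc123" and "a_b" are counted as character-only rows because matching stops being checked after the leading letters.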

Only Punctuation

only_punc reports the number of rows that contain only punctuation or symbols "^\\W+$" and puncpercentage is the respective percentage.
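
Because `\W` means "any non-word character", the quoted pattern also matches rows made only of white-space and does not match underscores, which count as word characters. A short sketch with hypothetical values:

```r
# Hypothetical values.
x <- c("!!!", "--", "a!", "   ", "", "___")

# TRUE where every character is a non-word character (at least one required).
is_punc <- grepl("^\\W+$", x)

only_punc      <- sum(is_punc)
puncpercentage <- round(100 * only_punc / length(x), 2)  # assumed denominator: all rows

print(setNames(is_punc, sprintf("'%s'", x)))
print(c(only_punc = only_punc, puncpercentage = puncpercentage))
```

Here the three-space string is counted as punctuation, while the empty string and "___" are not.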

Only Blanks

only_blanks reports the number of rows that contain a blank (“”) "^\\s{0}$" and blankspercent is the respective percentage. only_wspace reports the number of rows that contain any white-space, including a blank "^\\s*$" and wspacepercent is the respective percentage.
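
The difference between the two patterns can be seen on a small, hypothetical vector: the first matches only the empty string, while the second also matches strings made entirely of spaces or tabs. NA values match neither.

```r
# Hypothetical values.
x <- c("", " ", "\t", "a", " a ", NA)

# Empty string only.
only_blanks <- sum(grepl("^\\s{0}$", x))

# Empty string or white-space only.
only_wspace <- sum(grepl("^\\s*$", x))

blankspercent <- round(100 * only_blanks / length(x), 2)  # assumed denominator: all rows
wspacepercent <- round(100 * only_wspace / length(x), 2)

print(c(only_blanks = only_blanks, only_wspace = only_wspace))
```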

All Types

Variable Content Plot

Generates an interactive plotly bar plot of variable content by raw counts.

Continuous Variables

‘mpg’

Summary Statistics for mpg

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 10.4 | 20.09 | 19.2 | 33.9 | 6.03 |

‘cyl’

Summary Statistics for cyl

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 4 | 6.19 | 6 | 8 | 1.79 |

‘disp’

Summary Statistics for disp

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 71.1 | 230.72 | 196.3 | 472 | 123.94 |

‘hp’

Summary Statistics for hp

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 52 | 146.69 | 123 | 335 | 68.56 |

‘drat’

Summary Statistics for drat

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 2.76 | 3.6 | 3.7 | 4.93 | 0.53 |

‘wt’

Summary Statistics for wt

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 1.51 | 3.22 | 3.33 | 5.42 | 0.98 |

‘qsec’

Summary Statistics for qsec

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 14.5 | 17.85 | 17.71 | 22.9 | 1.79 |

‘vs’

Summary Statistics for vs

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 0 | 0.44 | 0 | 1 | 0.5 |

‘am’

Summary Statistics for am

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 0 | 0.41 | 0 | 1 | 0.5 |

‘gear’

Summary Statistics for gear

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 3 | 3.69 | 4 | 5 | 0.74 |

‘carb’

Summary Statistics for carb

| Min | Mean | Median | Max | Standard Deviation |
| --- | --- | --- | --- | --- |
| 1 | 2.81 | 2 | 8 | 1.62 |
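
The per-variable summaries above can be reproduced in one pass with base R. This is a sketch under the assumption that values are rounded to two decimals and that NA values would be dropped; the helper name `summarise_numeric` is hypothetical.

```r
# Hypothetical helper: five summary statistics for one numeric vector.
summarise_numeric <- function(x) {
  round(c(
    Min    = min(x, na.rm = TRUE),
    Mean   = mean(x, na.rm = TRUE),
    Median = median(x, na.rm = TRUE),
    Max    = max(x, na.rm = TRUE),
    SD     = sd(x, na.rm = TRUE)
  ), 2)
}

# Apply to every column; transpose so each variable is a row.
stats <- t(sapply(mtcars, summarise_numeric))

print(stats)
```

Collecting the results in a single matrix makes it easier to compare variables side by side, for example to spot that vs and am are binary indicators (Min 0, Max 1) even though their class is numeric.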


R Session Info

## R version 3.5.1 (2018-07-02)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: OS X El Capitan 10.11.6
## 
## Matrix products: default
## BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] grid      stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] plotly_4.8.0       bindrcpp_0.2.2     htmlwidgets_1.3   
##  [4] DT_0.4             kableExtra_0.9.0   gridExtra_2.3     
##  [7] ggfortify_0.4.5    scales_1.0.0       ggplot2_3.1.0     
## [10] stringr_1.3.1      forcats_0.3.0      data.table_1.11.4 
## [13] purrr_0.2.5        tidyr_0.8.1        haven_1.1.2       
## [16] telegram.bot_2.2.0 dplyr_0.7.8        rmarkdown_1.11    
## [19] here_0.1          
## 
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.0         assertthat_0.2.0   rprojroot_1.3-2   
##  [4] digest_0.6.18      mime_0.6           R6_2.3.0          
##  [7] plyr_1.8.4         backports_1.1.2    evaluate_0.12     
## [10] httr_1.4.0         highr_0.7          pillar_1.3.1      
## [13] rlang_0.3.1.9000   lazyeval_0.2.1     curl_3.3          
## [16] rstudioapi_0.9.0   labeling_0.3       webshot_0.5.1     
## [19] readr_1.3.1        munsell_0.5.0      shiny_1.2.0       
## [22] compiler_3.5.1     httpuv_1.4.5.1     xfun_0.4          
## [25] pkgconfig_2.0.2    htmltools_0.3.6    tidyselect_0.2.5  
## [28] tibble_2.0.1       viridisLite_0.3.0  crayon_1.3.4      
## [31] dbplyr_1.2.2       withr_2.1.2        later_0.7.5       
## [34] jsonlite_1.6       xtable_1.8-3       gtable_0.2.0      
## [37] DBI_1.0.0          magrittr_1.5       stringi_1.2.4     
## [40] promises_1.0.1     xml2_1.2.0         RColorBrewer_1.1-2
## [43] tools_3.5.1        Cairo_1.5-9        glue_1.3.0        
## [46] hms_0.4.2          crosstalk_1.0.0    yaml_2.2.0        
## [49] colorspace_1.4-0   rvest_0.3.2        knitr_1.21        
## [52] bindr_0.1.1



Process Time: 0.2 minutes